AITopics | bernoulli process

Collaborating Authors

bernoulli process

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

This paper addresses the growing concern of cascading extreme events, such as an extreme earthquake followed by a tsunami, by presenting a novel method for risk assessment focused on these domino effects. The proposed approach develops an extreme value theory framework within a Kolmogorov-Arnold network (KAN) to estimate the probability of one extreme event triggering another, conditionally on a feature vector. An extra layer is added to the KAN's architecture to enforce the definition of the parameter of interest within the unit interval, and we refer to the resulting neural model as KANE (KAN with Natural Enforcement). The proposed method is backed by exhaustive numerical studies and further illustrated with real-world applications to seismology and climatology.

artificial intelligence, kolmogorov-arnold neural model, machine learning, (18 more...)

arXiv.org Machine Learning

2505.1337

Country:

South America > Chile (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

Restricting exchangeable nonparametric distributions

Neural Information Processing SystemsMar-13-2024, 16:07:29 GMT

Distributions over matrices with exchangeable rows and infinitely many columns are useful in constructing nonparametric latent variable models. However, the distribution implied by such models over the number of features exhibited by each data point may be poorly-suited for many modeling tasks. In this paper, we propose a class of exchangeable nonparametric priors obtained by restricting the domain of existing models. Such models allow us to specify the distribution over the number of features per data point, and can achieve better performance on data sets where the number of features is not well-modeled by the original distribution.

conditional distribution, ibp, matrix, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Ohio (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

In-Context Learning Dynamics with Random Binary Sequences

Bigelow, Eric J., Lubana, Ekdeep Singh, Dick, Robert P., Tanaka, Hidenori, Ullman, Tomer D.

arXiv.org Artificial IntelligenceNov-27-2023

Large language models (LLMs) trained on huge corpora of text datasets demonstrate intriguing capabilities, achieving state-of-the-art performance on tasks they were not explicitly trained for. The precise nature of LLM capabilities is often mysterious, and different prompts can elicit different capabilities through in-context learning. We propose a framework that enables us to analyze in-context learning dynamics to understand latent concepts underlying LLMs' behavioral patterns. This provides a more nuanced understanding than success-or-failure evaluation benchmarks, but does not require observing internal activations as a mechanistic interpretation of circuits would. Inspired by the cognitive science of human randomness perception, we use random binary sequences as context and study dynamics of in-context learning by manipulating properties of context data, such as sequence length. In the latest GPT-3.5+ models, we find emergent abilities to generate seemingly random numbers and learn basic formal languages, with striking in-context learning dynamics where model outputs transition sharply from seemingly random behaviors to deterministic repetition.

arxiv preprint arxiv, probability, sequence, (14 more...)

arXiv.org Artificial Intelligence

2310.17639

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Education > Curriculum > Subject-Specific Education (0.46)
Health & Medicine > Therapeutic Area (0.46)

Add feedback

A Chain Rule for the Expected Suprema of Bernoulli Processes

Chu, Yifeng, Raginsky, Maxim

arXiv.org Artificial IntelligenceApr-27-2023

Sharp bounds on the suprema of Bernoulli processes are ubiquitous in applications of probability theory. For example, in statistical learning theory, they arise as so-called Rademacher complexities that quantify the effective "size" of a function class in the context of a given learning task [1]. To evaluate the generalization error bound of a learning algorithm, one usually needs to obtain a good estimate on the (empirical) Rademacher complexity of a suitable function class. In many scenarios, the function class at hand is composite, and separate assumptions are imposed on the constituent classes forming the composition. Thus, it is desirable to obtain a bound that takes into account the properties of the function classes in the composition.

artificial intelligence, bernoulli process, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2304.14474

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)

Add feedback

The Bernoulli and Binomial Distributions

#artificialintelligenceMar-1-2021, 03:25:57 GMT

The probability for a discrete random variable can be summarized with a discrete probability distribution. Discrete probability distributions are used in machine learning, most notably in the modeling of binary and multi-class classification problems, but also in evaluating the performance for binary classification models, such as the calculation of confidence intervals, and in the modeling of the distribution of words in text for natural language processing. Knowledge of discrete probability distributions is also required in the choice of activation functions in the output layer of deep learning neural networks for classification tasks and selecting an appropriate loss function. Discrete probability distributions play an important role in applied machine learning and there are a few distributions that a practitioner must know about. In this tutorial, you will discover discrete probability distributions (Bernoulli and Binomial Distribution) used in machine learning.

experiment, probability, probability distribution, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

An Algorithm for Fuzzification of WordNets, Supported by a Mathematical Proof

Hossayni, Sayyed-Ali, Akbarzadeh-T, Mohammad-R, Recupero, Diego Reforgiato, Gangemi, Aldo, Del Acebo, Esteve, Esteva, Josep Lluís de la Rosa i

arXiv.org Artificial IntelligenceJun-7-2020

WordNet-like Lexical Databases (WLDs) group English words into sets of synonyms called "synsets." Although the standard WLDs are being used in many successful Text-Mining applications, they have the limitation that word-senses are considered to represent the meaning associated to their corresponding synsets, to the same degree, which is not generally true. In order to overcome this limitation, several fuzzy versions of synsets have been proposed. A common trait of these studies is that, to the best of our knowledge, they do not aim to produce fuzzified versions of the existing WLD's, but build new WLDs from scratch, which has limited the attention received from the Text-Mining community, many of whose resources and applications are based on the existing WLDs. In this study, we present an algorithm for constructing fuzzy versions of WLDs of any language, given a corpus of documents and a word-sense disambiguation (WSD) system for that language. Then, using the Open-American-National-Corpus and UKB WSD as algorithm inputs, we construct and publish online the fuzzified version of English WordNet (FWN). We also propose a theoretical (mathematical) proof of the validity of its results.

artificial intelligence, natural language, text processing, (18 more...)

arXiv.org Artificial Intelligence

2006.04042

Country:

Europe > Spain > Catalonia > Girona Province > Girona (0.04)
Europe > Italy > Sardinia > Cagliari (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Convex Recovery of Marked Spatio-Temporal Point Processes

Juditsky, Anatoli, Nemirovski, Arkadi, Xie, Liyan, Xie, Yao

arXiv.org Machine LearningMar-28-2020

We present a multi-dimensional Bernoulli process model for spatial-temporal discrete event data with categorical marks, where the probability of an event of a specific category in a location may be influenced by past events at this and other locations. The focus is to introduce general forms of influence function which can capture an arbitrary shape of influence from historical events, between locations, and between different categories of events. The general form of influence function differs from the commonly adapted exponential delaying function over time, and more importantly, in our model, we can learn the delayed influence of prior events, which is an aspect seemingly largely ignored in prior literature. Prior knowledge or assumptions on the influence function are incorporated into our framework by allowing general convex constraints on the parameters specifying the influence function. We develop two approaches for recovering these parameters, using the constrained least-square (LS) and maximum likelihood (ML) estimations. We demonstrate the performance of our approach on synthetic examples and illustrate its promise using real data (crime data and novel coronavirus data), in extracting knowledge about the general influences and making predictions.

influence function, ml estimate, probability, (17 more...)

arXiv.org Machine Learning

2003.12935

Country:

Asia > Mongolia (0.04)
Asia > China > Tibet Autonomous Region (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(9 more...)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

The Weighted Tsetlin Machine: Compressed Representations with Weighted Clauses

Phoulady, Adrian, Granmo, Ole-Christoffer, Gorji, Saeed Rahimi, Phoulady, Hady Ahmady

arXiv.org Machine LearningNov-28-2019

The Tsetlin Machine (TM) is an interpretable mechanism for pattern recognition that constructs conjunctive clauses from data. The clauses capture frequent patterns with high discriminating power, providing increasing expression power with each additional clause. However, the resulting accuracy gain comes at the cost of linear growth in computation time and memory usage. In this paper, we present the Weighted Tsetlin Machine (WTM), which reduces computation time and memory usage by \emph{weighting} the clauses. Real-valued weighting allows one clause to replace multiple and supports fine-tuning the impact of each clause. Our novel scheme simultaneously learns both the composition of the clauses and their weights. Furthermore, we increase training efficiency by replacing $k$ Bernoulli trials of success probability $p$ with a uniform sample of average size $p k$, the size drawn from a binomial distribution. In our empirical evaluation, the WTM achieved the same accuracy as the TM on MNIST, IMDb, and Connect-4, requiring only $1/4$, $1/3$, and $1/50$ of the clauses, respectively. With the same number of clauses, the WTM outperformed the TM, obtaining peak test accuracies of respectively $98.58\%$, $90.15\%$, and $87.49\%$. Finally, our novel sampling scheme reduced sample generation time by a factor of $7$.

automata, tsetlin machine, wtm, (15 more...)

arXiv.org Machine Learning

1911.12607

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Sacramento County > Sacramento (0.04)
Europe > Norway (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)

Add feedback